Evaluating Human Correction Quality for Machine Translation from Crowdsourcing
Abstract
Machine translation (MT) technology is becoming increasingly pervasive, yet the quality of MT output is still far from ideal, so human corrections are used to edit the output for further studies. Judging these corrections, however, can be difficult when the annotators are not experts. We present a novel approach that uses cross-validation to automatically judge human corrections in a setting where each MT output is corrected by more than one annotator: cross-validation is applied both among corrections of the same machine translation and among corrections from the same annotator. We obtain a correlation of around 40% in sentence quality for Chinese-English and Spanish-English, and we also evaluate user quality. Finally, we rank the human corrections from good to bad, which lets us set a quality threshold that trades off the scope of the corrections against their quality.
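The core idea lends itself to a short sketch. The Python snippet below is a minimal illustration of the cross-validation scoring described above, not the paper's actual implementation: each correction is scored by its average similarity to the other annotators' corrections of the same MT output, and per-annotator quality is the average over that annotator's corrections. The input format, the function names, and the simple token-overlap F1 used as the similarity measure are all assumptions; a real setup would plug in a proper MT metric.

```python
from collections import Counter, defaultdict

def token_f1(hyp, ref):
    """Token-overlap F1 between two strings (a stand-in for a real MT metric such as BLEU or TER)."""
    hyp_toks, ref_toks = hyp.split(), ref.split()
    if not hyp_toks or not ref_toks:
        return 0.0
    overlap = sum((Counter(hyp_toks) & Counter(ref_toks)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(hyp_toks), overlap / len(ref_toks)
    return 2 * precision * recall / (precision + recall)

def cross_validate(corrections):
    """corrections: iterable of (sentence_id, annotator_id, corrected_text),
    where each MT output (sentence_id) is corrected by several annotators.
    Returns per-correction scores and per-annotator average scores."""
    by_sentence = defaultdict(list)
    for sent_id, annotator, text in corrections:
        by_sentence[sent_id].append((annotator, text))

    correction_scores = {}                # (sentence_id, annotator_id) -> score against peers
    annotator_scores = defaultdict(list)  # annotator_id -> list of scores
    for sent_id, items in by_sentence.items():
        for annotator, text in items:
            peers = [t for a, t in items if a != annotator]
            if not peers:
                continue  # a correction with no peer corrections cannot be cross-validated
            score = sum(token_f1(text, p) for p in peers) / len(peers)
            correction_scores[(sent_id, annotator)] = score
            annotator_scores[annotator].append(score)

    annotator_avg = {a: sum(s) / len(s) for a, s in annotator_scores.items()}
    return correction_scores, annotator_avg
```

Ranking the corrections by these scores is then what allows a quality threshold to be set: corrections above the threshold are kept, trading the scope of the collected corrections against their quality.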
Similar resources
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are central to Machine Translation (MT) engines, as the engines are developed through frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from the Lexical Similarity set on machine tra...
Crowdsourcing for Evaluating Machine Translation Quality
The recent popularity of machine translation has increased the demand for the evaluation of translations. However, the traditional evaluation approach, manual checking by a bilingual professional, is too expensive and too slow. In this study, we confirm the feasibility of crowdsourcing by analyzing the accuracy of crowdsourcing translation evaluations. We compare crowdsourcing scores to profess...
Evaluation of Automatic Video Captioning Using Direct Assessment
We present Direct Assessment, a method for manually assessing the quality of automatically-generated captions for video. Evaluating the accuracy of video captions is particularly difficult because for any given video clip there is no definitive ground truth or correct answer against which to measure. Automatic metrics for comparing automatic video captions against a manual caption such as BLEU ...
Using Crowdsourcing for Evaluation of Translation Quality
In recent years, a wide variety of machine translation services have emerged due to the increase in demand for tools that support multilingual communication. Machine translation services have the advantage of low cost, but also the disadvantage of low translation quality. Therefore, there is a need to evaluate translations in order to predict the quality of machine translation services. ...
Multiplying the Potential of Crowdsourcing with Machine Translation
Machine Translation (MT) is said to be the next lingua franca. With the evolution of new technologies and the capacity to produce a humongous number of written digital documents, human translators will not be able to translate documentation fast enough. However, some applications require a level of quality that is still beyond that provided by MT. Thanks to the increased capacity of communicati...
Year of publication: 2011